

NPS55-79-01 


NAVAL  POSTGRADUATE  SCHOOL 

Monterey,  California 


ESTIMATING  ERRORS  IN 
STUDENT  ENROLLMENT  FORECASTING 


by 


K.  T.  Marshall 

and 
R.  M.  Oliver 

January 1979


Approved  for  public  release;  distribution  unlimited 




NAVAL  POSTGRADUATE  SCHOOL 
MONTEREY,  CALIFORNIA 

Rear Admiral T. F. Dedman, Superintendent
J. R. Borsting, Provost


Reproduction  of  all  or  part  of  this  report  is  authorized. 
This  report  was  prepared  by: 


REPORT DOCUMENTATION PAGE

 1.  Report Number:  NPS55-79-01
 4.  Title:  Estimating Errors in Student Enrollment Forecasting
 5.  Type of Report:  Technical
 7.  Authors:  K. T. Marshall and R. M. Oliver
 9.  Performing Organization:  Naval Postgraduate School,
     Monterey, Ca. 93940
11.  Controlling Office:  Naval Postgraduate School,
     Monterey, Ca. 93940
12.  Report Date:  January 1979
13.  Number of Pages:  26
15.  Security Class. (of this report):  Unclassified
16.  Distribution Statement (of this Report):  Approved for public
     release; distribution unlimited.
19.  Key Words:  Student Enrollment; Forecasting; Confidence Bounds;
     Education; Planning

20.  Abstract

The purpose of this paper is to demonstrate how longitudinal data
can be used to determine variances, and hence confidence bounds, on
student enrollment forecasts in addition to finding the forecasts
themselves.  The cases of known admission numbers and unknown
admission numbers, but with an assumed Poisson distribution, are
both considered.  The model takes into account different admissions
at fall and spring semesters, and also allows for differences in
the continuation fractions for these different semesters.  Normal
approximations are used to calculate the probability that a total
enrollment lies in a given interval.  Numerical examples illustrate
the results.


ESTIMATING  ERRORS  IN  STUDENT  ENROLLMENT  FORECASTING 


K. T. Marshall*

and
R. M. Oliver†


*Naval Postgraduate School, Monterey, Ca.
†University of California, Berkeley, Ca.


DEDICATION 

This  paper  is  dedicated  to  the  memory  of  Sidney  Suslow, 
a  founding  member  of  the  Association  of  Institutional  Research, 
and  a  man  whose  constant  energy  went  into  the  support  of  its 
purposes and goals.  It was with Sid's support and encouragement
that we pursued our interest in higher educational
planning,  and  his  pioneering  work  in  obtaining  longitudinal 
data  on  students  led  directly  to  our  work  in  the  study  of 
longitudinal  models. 


0.  Introduction

In 1968 Sidney Suslow, together with his colleagues in
the Office of Institutional Research at the Berkeley Campus of
the University of California, completed a study (Suslow et al. [4])
of undergraduate student attendance patterns over time.
That report contains some of the earliest data the authors had
seen on a given group, or cohort, of students, and how the group
behaved over its undergraduate career.  Most institutions keep
only cross-sectional data obtained from enrollment statistics.
It was the availability of the Suslow data that led the authors
to pursue the formulation and analysis of enrollment models
based on longitudinal student attendance patterns.  The authors
presented a constant-work model (Marshall and Oliver [2]) which
explained the data quite successfully.  They also, together
with Suslow in [3], tried to find cross-sectional Markovian
models to fit the longitudinal data (this latter work is
reproduced in a shortened form in Chapter 2 of Grinold and
Marshall [1], which is perhaps more accessible than [3]).

The  purpose  of  this  paper  is  to  demonstrate  how  the 
longitudinal  data  can  be  used  to  determine  variances,  and  hence 
confidence  bounds,  on  student  enrollment  forecasts  in  addition 
to  finding  the  forecasts  themselves.   Thus  with  each  forecast 
we  have  a  measure  of  the  error  that  could  be  present. 


1.   Model  Formulation 

We  consider  discrete  points  in  time  such  as  the  beginning 
of  a  quarter,  semester,  or  academic  year.   The  particular  choice 
depends  on  the  model  use  and  the  availability  of  data.   In  our 
numerical  examples  we  use  the  data  from  Suslow  et  al.  [4],  and 
hence  our  time  points  coincide  with  semesters.   Thus  when  we  write 
t  =  1,2,3,...,  we  mean  the  start  of  the  first,  second,  third,  etc. 
semesters in the future; t = 0 will refer to the point "now" from
which forecasts are being made, and t = -1, -2, -3, ... will refer to
the first, second, third, etc. semesters in the past.

Our  first  aim  is  to  derive  an  expression  for  the  expected 
number  in  attendance  at  some  time   t  >  0.   We  do  not  differentiate 
groups  such  as  freshmen,  sophomores,  or  lower  division,  upper 
division.   This  could  easily  be  done  by  placing  subscripts  on  our 
notation,  but  we  choose  to  simplify  the  notation  to  be  consistent 
with  the  Suslow  data  on  total  student  attendance. 

Let   S(t;u)   be  the  number  of  students  in  attendance  at 
time   t  who  entered  (for  the  first  time)  at  time   t  -  u, 
u  =  0,1,...  .   Let   S(t)   be  the  total  number  of  students  in 
attendance  at  time   t.   Then 

$$S(t) = S(t;0) + S(t;1) + S(t;2) + \cdots + S(t;u) + \cdots . \tag{1}$$

The data in [4] showed that for the periods studied
(1950's and 1960's) there was very stable behavior in student
attendance; the fraction of students who attended a given semester
after entrance was independent of when the students first entered.
However, only fall-entering cohorts were studied.  We assume here
that stable behavior could be expected from spring-entering
cohorts also, but that fall- and spring-entering students could
have different continuation fractions.  Let $p_1(u)$ be the
probability that a student attends at time u after entering in the
fall, independent of the particular entrance time.  Let $p_2(u)$
be the equivalent probability for spring-entering students.  We also
assume that the attendance of any given student is independent
of the attendance or non-attendance of any other student; i.e.,
all students act independently of each other.  Table 1 gives
$p_1(u)$ as determined by Suslow et al. in [4].

Let   N(t)   be  the  number  of  new  students  who  enter  at 
time   t.   The  above  two  assumptions  imply  that  the  value  of  (the 
random  variable)   S(t;u) ,  given  the  value  of   N(t-u) ,  has  a 
binomial  probability  distribution.   That  is, 


$$\Pr[S(t;u) = k \mid N(t-u) = m] = \binom{m}{k}\, p_i(u)^{k}\, [1 - p_i(u)]^{m-k}, \tag{2}$$

for k = 0,1,...,m, and $m \ge 0$, where i = 1 for fall students
and i = 2 for spring students.  In particular the conditional
expectation and the conditional variance of S(t;u) are given
respectively by

$$E[S(t;u) \mid N(t-u) = m] = m\,p_i(u), \tag{3}$$

$$\mathrm{Var}[S(t;u) \mid N(t-u) = m] = m\,p_i(u)\,[1 - p_i(u)]. \tag{4}$$
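As a small numerical illustration of (3) and (4), the sketch below evaluates the conditional mean and variance for a single cohort. The function name and the cohort size of 3000 are chosen here for illustration only; the probability 0.684 is the Table 1 value of p_1(u) at u = 4.

```python
def cohort_moments(cohort_size, p):
    """Conditional mean and variance of S(t;u) given N(t-u), per (3) and (4)."""
    mean = cohort_size * p                  # equation (3): m * p_i(u)
    variance = cohort_size * p * (1.0 - p)  # equation (4): m * p_i(u) * (1 - p_i(u))
    return mean, variance


# Illustrative cohort of 3000 students, four semesters after entry.
mean, variance = cohort_moments(3000, 0.684)
print(round(mean), round(variance, 1))
```

The variance is largest when p is near one half, which is why the uncertainty per entering student peaks several semesters after entry rather than at the start.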


   u      p_1(u)     p_1(u)(1 - p_1(u))     p_1(u)^2

   0      1.0            0.0                 1.0
   1       .972           .0272               .9448
   2       .905           .0860               .8190
   3       .756           .1845               .5715
   4       .684           .2161               .4679
   5       .593           .2414               .3516
   6       .562           .2462               .3158
   7       .524           .2494               .2746
   8       .498           .2500               .2480
   9       .199           .1594               .0396
  10       .130           .1131               .0169
  11       .050           .0475               .0025
  12       .036           .0347               .0013
  13       .017           .0167               .0003
  14       .015           .0148               .0002
  15       .011           .0109               .0001
  16       .007           .0070               .0000

 Total    6.959          1.905               5.054

TABLE 1:  Sample student attendance data from Suslow et al. [4]


Let   t   be  the  start  of  a  fall  semester.   After  taking 
expectations  in  (1)  and  using  (3),  the  expected  total  enrollment 
at  time   t   is 


$$E[S(t)] = \sum_{u=0}^{\infty} p_{i(u)}(u)\, E[N(t-u)]. \tag{5}$$

Here we have let

$$i(u) = \begin{cases} 1 & \text{if } u = 0,2,4,6,\ldots \\ 2 & \text{if } u = 1,3,5,7,\ldots \end{cases}$$

For  any  two  random  variables   X  and  Y   the  expression 

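$$\mathrm{Var}[X] = E[\mathrm{Var}[X \mid Y]] + \mathrm{Var}[E[X \mid Y]]$$

holds.  We use this together with (1), (3) and (4) to obtain for
the variance of the total enrollment at time t,

$$\mathrm{Var}[S(t)] = \sum_{u=0}^{\infty} \Bigl\{ E[N(t-u)]\, p_{i(u)}(u)\,\bigl(1 - p_{i(u)}(u)\bigr) + p_{i(u)}(u)^{2}\, \mathrm{Var}(N(t-u)) \Bigr\}. \tag{6}$$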


Equations  (5)  and  (6)  give  the  expected  enrollment  and 
its  variance  at  time   t.   Recall  that   t   is  a  fall  semester. 
For  the  case  when   t   is  a  spring  semester  we  use 

$$i(u) = \begin{cases} 2 & \text{if } u = 0,2,4,6,\ldots \\ 1 & \text{if } u = 1,3,5,7,\ldots \end{cases}$$
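The step from the variance identity to the expression for Var[S(t)] can be written out cohort by cohort. The sketch below applies the identity with X = S(t;u) and Y = N(t-u), and then sums over u. It assumes, in addition to the independence of individual students, that the admissions N(t-u) in different periods are independent of one another, so that the variances of the cohort terms add; the text does not state that assumption explicitly.

```latex
\begin{align*}
\mathrm{Var}[S(t;u)]
  &= E\bigl[\mathrm{Var}[S(t;u)\mid N(t-u)]\bigr]
     + \mathrm{Var}\bigl[E[S(t;u)\mid N(t-u)]\bigr] \\
  &= E\bigl[N(t-u)\,p_{i(u)}(u)\,(1-p_{i(u)}(u))\bigr]
     + \mathrm{Var}\bigl[N(t-u)\,p_{i(u)}(u)\bigr] \\
  &= p_{i(u)}(u)\,(1-p_{i(u)}(u))\,E[N(t-u)]
     + p_{i(u)}(u)^{2}\,\mathrm{Var}[N(t-u)], \\[4pt]
\mathrm{Var}[S(t)]
  &= \sum_{u=0}^{\infty} \mathrm{Var}[S(t;u)] .
\end{align*}
```

The first term is the binomial scatter of returning students around their expected number; the second carries the uncertainty in the size of the entering cohort itself.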


These  expressions  do  not  take  into  account  the  fact 
that  we  have  knowledge  of  enrollments  up  to  time   t  =  0   (the 
current  time  in  our  timing  convention) .   In  (5)  we  know  the 
values  of   N(0),  N(-l),  N(-2),  etc.  and  thus  our  forecast  for 
t  >  0   becomes 

$$E[S(t) \mid N(0), N(-1), \ldots] = \sum_{u=t}^{\infty} p_{i(u)}(u)\, N(t-u) + \sum_{u=0}^{t-1} p_{i(u)}(u)\, E[N(t-u)], \tag{7}$$

where i(u) is defined above for the particular case that t
is either fall or spring.  The first summation term in equation
(7)  gives  the  expected  "legacy"  at  time  t  of  the  given  inputs 
up  to  and  including  the  current  time  zero.   The  second  summation 
gives  the  expected  enrollment  at  time  t  from  the  expected  input 
of  new  students  at  times   1,  2,  ...  ,  t. 

Similarly,  by  using  equation  (6) ,  the  variance  of  the 
forecast  at   t,  given  inputs  up  to  and  including  time  zero, 
becomes 

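$$\mathrm{Var}[S(t) \mid N(0), N(-1), \ldots] = \sum_{u=t}^{\infty} p_{i(u)}(u)\,\bigl(1 - p_{i(u)}(u)\bigr)\, N(t-u) + \sum_{u=0}^{t-1} \Bigl\{ p_{i(u)}(u)\,\bigl(1 - p_{i(u)}(u)\bigr)\, E[N(t-u)] + p_{i(u)}(u)^{2}\, \mathrm{Var}(N(t-u)) \Bigr\}. \tag{8}$$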

The first summation gives the contribution to the variance from
the inputs up to and including the present.  The second summation
gives the contribution which will occur from future inputs.  Note
that this depends on the variance of the new inputs for times
1,2,...,t as well as the variance due to returning students.
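Equations (7) and (8) translate directly into a short routine. The following is a minimal sketch, not the authors' program: the function name, argument layout, and the toy numbers in the usage lines are all hypothetical, and the infinite sums are truncated at the last semester for which a continuation probability is supplied.

```python
def conditional_forecast(t, t_is_fall, p_fall, p_spring,
                         past_inputs, future_mean, future_var):
    """Mean and variance of S(t) given N(0), N(-1), ..., per (7) and (8).

    p_fall[u], p_spring[u] : continuation probabilities p_1(u), p_2(u)
    past_inputs[k]         : known input N(-k), k = 0, 1, 2, ...
    future_mean[s], future_var[s] : E[N(s)], Var[N(s)] for s = 1, ..., t
    """
    mean = 0.0
    var = 0.0
    for u in range(len(p_fall)):
        # i(u): the cohort that entered at t - u is a fall cohort when
        # u is even and t is fall, or when u is odd and t is spring.
        fall_cohort = (u % 2 == 0) == t_is_fall
        p = p_fall[u] if fall_cohort else p_spring[u]
        s = t - u                      # entry time of this cohort
        if s <= 0:                     # known input: "legacy" terms
            k = -s
            if k >= len(past_inputs):
                continue               # no record supplied; treated as zero
            n = past_inputs[k]
            mean += p * n
            var += p * (1.0 - p) * n
        else:                          # future input: random cohort size
            mean += p * future_mean[s]
            var += p * (1.0 - p) * future_mean[s] + p * p * future_var[s]
    return mean, var


# Toy numbers only: three semesters of attendance, t = 1 (a spring
# semester when t = 0 is fall), one uncertain incoming cohort.
p = [1.0, 0.9, 0.5]
mean, var = conditional_forecast(
    t=1, t_is_fall=False, p_fall=p, p_spring=p,
    past_inputs=[100, 50], future_mean={1: 40}, future_var={1: 40})
print(mean, var)
```

Keeping the legacy and future terms in separate branches mirrors the two summations in (7) and (8), so the contribution of each source of uncertainty can be inspected separately.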

Table 1 gives data for $p_1(u)$, $u \ge 0$, obtained originally
in the study by Suslow et al. [4], and reproduced on page 66 of
[1].  The third and fourth columns give $p_1(u)(1 - p_1(u))$ and
$p_1(u)^2$ respectively.  These data are required in equation (8),
whereas the data in column 2 are required in equation (7).

The usual interpretation given to the second column in
Table 1 is simply the fraction of attending students out of a
given cohort.  The third column is the variance of the S(t;u)
terms divided by N(t-u).  It is interesting to see how the
conditional expectation and the variance of the number of
attending students vary with the number of time periods that have
elapsed since initial registration.  As one might expect, the
fraction of students out of a given cohort that return to attend
decreases rapidly and there is a sharp drop in attendance after
eight semesters.  By the end of the 12th semester the fraction
of attending students decreases to a number less than 4% of
the original cohort.  However, the conditional variance of the
number returning first increases, has its maximum when seven or
eight semesters have elapsed and then decreases to a negligible
amount by the end of the 12th semester.  About the 12th semester,
the conditional expectation and variance of the number attending
are about equal; this result is not surprising, if we recall
that the Poisson distribution (whose variance and mean are equal)
is a good approximation to the binomial distribution when the
probability p(u) is small.  Thus, students returning after
10 periods can be classified as "rare" events in the sense that
while the probability that an individual student attends is
small the original cohort is large enough so that the probability
distribution of returning students is approximately Poisson.  By
similar arguments one can deduce that the number who do not attend
in the first few semesters is also approximately Poisson
distributed.

Consider a simple system where there is no variance in
the new student input, which is a fixed amount, say $n_1$, in each
fall semester, and a fixed amount $n_2$ in each spring semester.
Thus $E[N(t)] = n_i$ and $\mathrm{Var}[N(t)] = 0$ for all t, where
i = 1 for a fall semester and i = 2 in the spring.  Using these in
(7) and (8), and assuming $p_1(u) = p_2(u)$ with the data in Table 1,
we obtain

$$E[S(t)] = 3.837\,n_1 + 3.122\,n_2, \qquad \mathrm{Var}[S(t)] = 0.968\,n_1 + 0.937\,n_2$$

for t a fall semester, and

$$E[S(t)] = 3.837\,n_2 + 3.122\,n_1, \qquad \mathrm{Var}[S(t)] = 0.968\,n_2 + 0.937\,n_1$$

for t a spring semester.  The coefficients are the sums of the
second and third columns of Table 1 taken over even u and over
odd u respectively.  All these expressions are independent
of t because of the constant input each period.

Table 2 illustrates the use of these equations for three
combinations of fall and spring input totalling 4000 per year,
and assuming $\{p_1(u) = p_2(u)\}$ are given in Table 1.


   Semester Input       Expected Enrollment     Variance of Enrollment

   Fall     Spring       Fall      Spring         Fall      Spring

   4,000        0       15,348     12,488         3,872     3,748
   3,000    1,000       14,633     13,203         3,841     3,779
   2,000    2,000       13,918     13,918         3,810     3,810

TABLE 2:  Illustrative calculations for differing fall/spring
input values.
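The entries of Table 2 can be checked directly from the second column of Table 1. The sketch below is one way to do so; the function name is hypothetical, and it assumes p_1(u) = p_2(u) as in the text. Because it uses unrounded products p(1 - p) rather than the three-decimal coefficients, the variances can differ from the tabulated ones by about one unit.

```python
# Second column of Table 1: p_1(u) for u = 0, 1, ..., 16.
P = [1.0, .972, .905, .756, .684, .593, .562, .524, .498,
     .199, .130, .050, .036, .017, .015, .011, .007]


def steady_state(n_same, n_other, p=P):
    """Mean and variance of S(t) under constant, known inputs.

    n_same  : input in semesters of the same type as t (u even)
    n_other : input in the other type of semester (u odd)
    """
    mean = 0.0
    var = 0.0
    for u, pu in enumerate(p):
        n = n_same if u % 2 == 0 else n_other
        mean += pu * n               # term of (7) with known input
        var += pu * (1.0 - pu) * n   # term of (8) with Var[N] = 0
    return mean, var


for fall, spring in [(4000, 0), (3000, 1000), (2000, 2000)]:
    fall_mean, fall_var = steady_state(fall, spring)
    spring_mean, spring_var = steady_state(spring, fall)
    print(fall, spring, round(fall_mean), round(spring_mean),
          round(fall_var), round(spring_var))
```

Shifting admissions from fall to spring leaves the yearly average enrollment unchanged but narrows the gap between fall and spring totals, as the three rows show.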


A fairly typical use for Equations (7) and (8) is that
of forecasting one period into the future.  With the convention
that t = 0 represents today (the start of a fall semester),
we obtain the next period forecast

$$E[S(1) \mid N(0), N(-1), \ldots] = \sum_{u=1}^{\infty} p_{i(u)}(u)\, N(1-u) + E[N(1)],$$

with i(u) = 2 for u even and i(u) = 1 for u odd (t = 1 being a
spring semester), and provided $p_2(0) = 1$.  The first (summation)
term represents the expected number of returning students and the
second term represents the expected number of new admissions.  The
corresponding expression for the variance of enrollments in the
next period is

$$\mathrm{Var}[S(1) \mid N(0), N(-1), \ldots] = \sum_{u=1}^{\infty} p_{i(u)}(u)\,\bigl(1 - p_{i(u)}(u)\bigr)\, N(1-u) + \mathrm{Var}[N(1)].$$

In  this  case  where  we  assume  all  entering  students  in  fact 
show  up,  the  fluctuations  are  due  either  to  the  uncertainty 
in  the  count  of  returning  students  already  enrolled  or  to  the 
uncertainty  in  the  new  students.   Thus  one  can  obtain  some  idea 
of  where  new  forecasting  efforts  should  be  directed.   In  certain 
institutions  the  dominant  problem  may  be  the  uncertainties 
associated  with  returning  students  rather  than  with  new  students. 
If, for example, the past cohorts were approximately 3000 in
each fall and 1000 in each spring, but the next group of
entering students were Poisson with expected number and variance
equal to 1000, then we would have (from Table 2)

$$\mathrm{Var}[S(1) \mid N(0), N(-1), \ldots] = 3779 + 1000 = 4779.$$

In this case, two standard deviations (a measure of error often
used and based on Normal distribution theory) would be 138 students,
which is slightly larger than the value we obtain when all
admissions are constant ($2 \times \sqrt{3779} \approx 123$ from
Table 2).  In other words it is possible to make various assumptions
about the uncertainty of future enrollments and/or returning
students and easily include them in our estimates of enrollment
fluctuations.
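The comparison in this example is simple arithmetic and can be reproduced as follows. The variable names are illustrative; the value 3779 is the spring-semester variance read from Table 2 for inputs of 3000 each fall and 1000 each spring.

```python
import math

var_returning = 3779  # variance from returning students (Table 2, spring)
var_new = 1000        # Poisson admissions: variance equals the mean

two_sigma_fixed = 2 * math.sqrt(var_returning)
two_sigma_poisson = 2 * math.sqrt(var_returning + var_new)
share_new = var_new / (var_returning + var_new)

print(round(two_sigma_fixed), round(two_sigma_poisson), round(share_new, 2))
```

Under these assumptions roughly a fifth of the one-period forecast variance comes from the new admissions, which is the kind of breakdown that indicates where additional forecasting effort would pay off.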

It  is  unlikely  that  student  input  each  period  would  be 
constant.   In  the  next  section  we  analyze  the  model  assuming 
that new admissions follow a Poisson distribution.


2.  Poisson Admissions

The  number  of  new  students  who  actually  enroll  in  a 
given  future  semester  is  not  known  with  certainty.   A  simple 
method  of  modelling  this  uncertainty  is  to  assume  the  number 
of  new  enrollments  follows  a  Poisson  distribution.   Let  n(t) 
be  the  expected  number  of  new  enrollments  at  time  t.   Then 


$$\Pr[N(t) = m] = \frac{n(t)^{m}\, e^{-n(t)}}{m!}, \qquad m \ge 0. \tag{9}$$

From equations (2) and (9) we get

$$\Pr[S(t;u) = k] = \frac{\bigl[p_{i(u)}(u)\, n_{i(u)}(t-u)\bigr]^{k}\, e^{-p_{i(u)}(u)\, n_{i(u)}(t-u)}}{k!}. \tag{10}$$

This shows that each random variable in (1) has a Poisson
distribution, which together with our independence assumption,
implies that the total enrollment at time t has a Poisson
distribution at every time t, with

$$E[S(t)] = \mathrm{Var}[S(t)] = \sum_{u=0}^{\infty} p_{i(u)}(u)\, n_{i(u)}(t-u).$$
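The passage from (2) and (9) to the Poisson form of Pr[S(t;u) = k] is the standard thinning argument. As a sketch, write p for the continuation probability of the cohort and n for its expected size (shorthand introduced here only to keep the lines short), and sum the binomial probabilities over the possible cohort sizes m:

```latex
\begin{align*}
\Pr[S(t;u) = k]
  &= \sum_{m=k}^{\infty} \binom{m}{k} p^{k} (1-p)^{m-k}\,
     \frac{n^{m} e^{-n}}{m!} \\
  &= \frac{(pn)^{k} e^{-n}}{k!}
     \sum_{m=k}^{\infty} \frac{\bigl[(1-p)\,n\bigr]^{m-k}}{(m-k)!} \\
  &= \frac{(pn)^{k} e^{-n}}{k!}\, e^{(1-p)n}
   = \frac{(pn)^{k} e^{-pn}}{k!} .
\end{align*}
```

So the number attending from a Poisson cohort is again Poisson, with mean pn, and a sum of independent Poisson variables is Poisson with the means added.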


Using our previous example, but with Poisson input
instead of fixed input, with $n_1 = 3000$, $n_2 = 1000$ and
$p_1(u) = p_2(u)$ as in Table 1, we get again an expected enrollment
of 14,633 each fall and 13,203 each spring, but with variances
of the same values.  Thus two standard deviations would be 242
each fall and 230 each spring, which show much more uncertainty
in the forecasts, as one would expect.


3.  Large Cohort Sizes

We have already shown in equation (2) that the number
of students attending out of a given entering cohort can be viewed
as the result of summing successes in Bernoulli trials, where
the probability of success is the probability that a student
attends in a given semester.  Thus, if we add a finite number of
such random variables to obtain the attendance at a later time
period we again obtain a sum of successes in a finite number of
Bernoulli trials.  If the parameter $p_i(u)$ of the Binomial
distribution in (2) did not change with time, then it would also be
true that the sum in (1) is binomially distributed.  This follows
from the derivation of the distribution of the sum of successes
in a finite number of Bernoulli trials, each trial having the
same probability of success.  Unfortunately, that is not the
case; as we can easily see from Table 1 the parameter $p_i(u)$
changes rather dramatically with elapsed time since entry and
the resulting distribution is obtained from the convolution of
as many binomial distributions, with changing parameters, as
there are terms in (1).  Although explicit expressions can be
found for the generating function of such distributions,
algebraic expressions for the distribution itself are not simple.
Fortunately, however, much can be said about the approximate
behavior of the conditional distribution of S(t) if we assume
that entering cohorts contain large numbers of students.

The central limit theorem of probability theory states
that if S(t;u) is the number of successes in n(t-u)
trials, each with success probability $p_i(u)$, then the normalized
sum

$$S^{*}(t;u) = \frac{S(t;u) - p_i(u)\, n(t-u)}{\bigl[p_i(u)\,(1 - p_i(u))\, n(t-u)\bigr]^{1/2}} \tag{11}$$

is approximately normally distributed.  If we write

$$\Phi(a) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-v^{2}/2}\, dv$$

for the normal distribution function, then with large cohort
sizes, i.e., large numbers entering at t-u,

$$\Pr[S^{*}(t;u) \le a] \approx \Phi(a) \quad \text{independent of } p_i(u) \text{ and } t. \tag{12}$$

As long as each entering cohort is large and entering cohorts
act independently of one another the sum of a finite number of
terms in (1) is also approximately normal.  In this case

$$\Pr[S^{*}(t) \le a] \approx \Phi(a), \tag{13}$$

where the normalization for $S^{*}(t)$ is given by

$$S^{*}(t) = \frac{S(t) - \sum_{u \ge 0} p_{i(u)}(u)\, n(t-u)}{\Bigl[\sum_{u \ge 0} n(t-u)\, p_{i(u)}(u)\,\bigl(1 - p_{i(u)}(u)\bigr)\Bigr]^{1/2}}. \tag{14}$$

Table 3 gives E[S(t;u)] for u = 0,1,...,16 and
E[S(t)] together with 95% confidence intervals.  Also tabulated
is the length of the confidence intervals as a percentage of
the expected values.  Fall and spring semesters are shown in
separate columns for clarity (again t is assumed to be a fall
semester).  Note how the uncertainty as a percentage of the mean
increases with time enrolled, and how small the error is on
the total enrolled forecast compared to the individual semesters.
Equations  (13)  and  (14)  can  be  used  to  obtain  more 
information  on  the  uncertainty  in   S(t);  one  can  estimate  the 
probability  of  the  enrollment  exceeding  any  given  figure,  of 
not  exceeding  any  given  figure,  or  of  being  in  any  given  range. 
Let a and b be any two numbers with a < b.  Then for
$n_1 = 3000$, $n_2 = 1000$, t a fall semester, and the data given
in Table 1 with $p_1(u) = p_2(u)$,

$$\Pr[a < S(t) < b] \approx \Phi\!\left(\frac{b - 14{,}633}{62}\right) - \Phi\!\left(\frac{a - 14{,}633}{62}\right).$$

From tables of the normal distribution we see that

$$\begin{aligned}
P[S(t) \le 14{,}700] &\approx 0.86, \\
P[S(t) \ge 14{,}500] &\approx 0.98, \\
P[14{,}500 < S(t) < 14{,}700] &\approx 0.84.
\end{aligned} \tag{15}$$

              E[S(t;u)] and 95%              Confidence Interval as
              Confidence Interval            % of E[S(t;u)]

  Time u      Fall           Spring          Fall        Spring

     0       3000 ± 0                          0
     1                       972 ± 10                      2.1
     2       2715 ± 32                         2.4
     3                       756 ± 27                      7.1
     4       2052 ± 51                         5.0
     5                       593 ± 31                     10.5
     6       1686 ± 54                         6.4
     7                       524 ± 32                     12.2
     8       1494 ± 55                         7.4
     9                       199 ± 25                     25.1
    10        390 ± 37                        19.0
    11                        50 ± 14                     56.0
    12        108 ± 20                        37.0
    13                        17 ± 8                      94.1
    14         45 ± 13                        57.8
    15                        11 ± 7                     127.3
    16         21 ± 9                         85.7

  Total     14,633 ± 124                       1.7
TABLE 3:  Forecasts and confidence intervals for each semester
enrollment, $n_1 = 3000$, $n_2 = 1000$, and t a fall semester

The Normal approximation for S(t) still holds if the
admissions each semester are assumed to be Poisson, since the
total enrollment is the sum of independent Poisson random
variables with distribution given by (10).  In this case we
consider

$$S^{*}(t) = \frac{S(t) - \sum_{u \ge 0} p_{i(u)}(u)\, n(t-u)}{\Bigl[\sum_{u \ge 0} p_{i(u)}(u)\, n(t-u)\Bigr]^{1/2}}.$$

For fall Poisson inputs with mean 3000, spring Poisson inputs
with mean 1000, t a fall semester, and assuming $p_1(u) = p_2(u)$
given in Table 1, then

$$P[a < S(t) < b] \approx \Phi\!\left(\frac{b - 14{,}633}{121}\right) - \Phi\!\left(\frac{a - 14{,}633}{121}\right).$$

In this case

$$\begin{aligned}
P[S(t) \le 14{,}700] &\approx 0.71, \\
P[S(t) \ge 14{,}500] &\approx 0.86, \\
P[14{,}500 \le S(t) \le 14{,}700] &\approx 0.57.
\end{aligned} \tag{16}$$

A comparison of (15) and (16) shows the added uncertainty in
the forecast due to randomness in the numbers of admissions.
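The probabilities in (15) and (16) can be reproduced without normal tables by using the error function. The sketch below is illustrative only; the helper names are hypothetical, and the mean 14,633 and standard deviations 62 and 121 are the values quoted in the text for fixed and Poisson admissions respectively.

```python
import math


def phi(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def interval_prob(a, b, mean, sd):
    """Normal approximation to Pr[a < S(t) < b]."""
    return phi((b - mean) / sd) - phi((a - mean) / sd)


MEAN = 14633
for label, sd in [("fixed admissions", 62), ("Poisson admissions", 121)]:
    below = phi((14700 - MEAN) / sd)          # Pr[S(t) <= 14,700]
    above = 1.0 - phi((14500 - MEAN) / sd)    # Pr[S(t) >= 14,500]
    inside = interval_prob(14500, 14700, MEAN, sd)
    print(f"{label}: {below:.2f} {above:.2f} {inside:.2f}")
```

Doubling the standard deviation while holding the mean fixed lowers the probability of landing in the same 200-student band from about 0.84 to about 0.57.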




REFERENCES 


[1]   Grinold, R.C., and Marshall, K.T., Manpower Planning Models,
Elsevier-North Holland, New York (1977).

[2]   Marshall, K.T., and Oliver, R.M., "A Constant Work Model
for Student Enrollment and Attendance," Operations
Research, Vol. 18, pp. 193-206 (1970).

[3]   Marshall, K.T., Oliver, R.M., and Suslow, S., "Undergraduate
Enrollment Attendance Patterns," Report No. 4,
Administrative Studies Project in Higher Education,
Office of Institutional Research, University of
California, Berkeley, Ca. 94720 (1970).

[4]   Suslow, S., Langlois, E., Sumariwalla, R., and Walther, C.,
"Student Performance and Attrition at the University
of California, Berkeley," Office of Institutional
Research, University of California, Berkeley, Ca.
94720 (1968).




INITIAL  DISTRIBUTION  LIST 

NO.  OF  COPIES 

Dean of Research                                          1
Code 012
Naval Postgraduate School
Monterey, CA 93940

Defense Documentation Center                              2
Cameron Station
Alexandria, VA 22314

Library, Code 0212                                        1
Naval Postgraduate School
Monterey, CA 93940

Library, Code 55                                          1
Naval Postgraduate School
Monterey, CA 93940

Librarian                                                 1
Krannert Graduate School of Ind. Engineering
Purdue University
Lafayette, Indiana 47907

Librarian                                                 1
Department of Administrative Science
Yale University
2 Hillhouse Ave.
New Haven, CT 06520

Order Librarian                                           1
Graduate School of Business Administration
University of Michigan
Ann Arbor, MI 48104

Cowles Foundation Library                                 1
Box 2125 Yale Station
Yale University
New Haven, CT 06520

Social Systems Research Library                           1
6470 Social Science Bldg
University of Wisconsin
Madison, WI 53711

Receiving Station Library                                 1
University of Texas
P.O. Box 30365
Dallas, TX 75230

Naval War College                                         1
Attn: Library
Newport, R.I. 02840

Dr. Richard C. Grinold                                    1
Graduate School of Business
Barrows Hall
University of California
Berkeley, CA 94720

Dr. David Hopkins                                         1
Academic Planning Office
Stanford University
Stanford, CA 94305

Dr. Robert M. Oliver                                      1
Operations Research
Etcheverry Hall
University of California
Berkeley, CA 94720

Dr. Thomas C. Varley                                      1
Code 434
Office of Naval Research
Arlington, VA 22217

Office of Institutional Research                          1
University of California
Berkeley, CA 94720

Naval Postgraduate School
Monterey, CA 93940
Code 55
Attn:  K. T. Marshall, 55Mt                              20
       R. J. Stampfel, 55                                 1
       J. R. Borsting, Code 01                            1




